Session 13: Prosody
Authors
Abstract
The aim of this introductory section is to set the context for Session 13: Prosody. It will do so by defining some basic terms, by considering the status of current research on prosody, and by outlining the papers in the session and how they contribute to and complement previous work in the area. Prosody, perceptually, can be thought of as the relative temporal groupings of words and the relative prominence of certain syllables within these groupings. Acoustic correlates of prosody include patterns of relative duration of segments and of silences, fundamental frequency, amplitude, and "vowel color." Phonological variation is related in part to prosodic structure and is sometimes also considered part of prosody. The bulk of the research on prosody, in the literature as well as in this session, is focused on the primary acoustic correlates of prosody, namely patterns of fundamental frequency and duration. It is appropriate to end this workshop on this topic, since prosody, perhaps more than any other area in spoken language systems, requires the involvement of both speech and natural language. Prosody can provide for natural language a source of acoustic information bearing on higher linguistic levels. Further, this information is largely unrepresented in text. One of the questions addressed in this session is: How can the acoustic attributes of prosody be transmitted not just to a speech recognizer (which is used to interpreting acoustic information), but also to natural language understanding components? The temporal (grouping) aspect of prosody appears to be related to the syntactic structure of an utterance, and one could imagine a component that would pass temporal information to a parser. The prominence aspect of prosody appears to be related to the semantic and discourse/pragmatic structure, and one could imagine a component that would pass prominence information to these levels.
However, both grouping and prominence relationships are involved to some extent in all linguistic levels, and a more complex architecture is required. Prosody is an area ripe for further research since it requires the integration of information from all levels, from the acoustics through morphology, syntax, semantics and pragmatics. Few, if any, researchers are comfortable in all these areas, and the work requires collaboration across traditional divisions among disciplines, as these papers show. In this session, in contrast to historical trends in prosody research, there is a focus on using statistical and corpus-based techniques. Further, in this session, these techniques are used …
Similar resources
9th ISCA Workshop on Speech Synthesis
Keynote Session 1; Oral Session 1: Prosody; Poster Session 1; Keynote Session 2; Oral Session 2: Deep Learning in Speech Synthesis; Demo Session ...
Social and Linguistic Speech Prosody
NOTE: Page numbers refer to the digital form of the proceedings, where full papers are included; these can be downloaded from http://www.speechprosody2014.org/proceedings.pdf
1 Day One, May 20th, Tuesday. Opening Session, 1:30pm to 2pm: 1-0-opening (3 bros: welcome! etc). 1.1 Tuesday Session One, 2pm to 3:30pm: 1-1-plenary (1+3 presentations)
Session 11: Prosody
This paper provides a brief introduction to prosody research in the context of human-computer communication and an overview of the contributions of the papers in the session. In large part, prosody is "the relative temporal groupings of words and the relative prominence of certain syllables within these groupings" (Price and Hirschberg [1]). This organization of the words, as Silverman points o...
Displaying prosodic text to enhance expressive oral reading
This study assessed the effectiveness of software designed to facilitate expressive oral reading through text manipulations that convey prosody. The software presented stories in standard (S) and manipulated formats corresponding to variations in fundamental frequency (F), intensity (I), duration (D), and combined cues (C) indicating modulation of pitch, loudness and length, respectively. Ten e...
Apa: an Object Oriented System for Automatic Prosodic Analysis
List of Figures; List of Tables ...